5,235 research outputs found

    Identifying and improving reusability based on coupling patterns

    Get PDF
    Open Source Software (OSS) communities have not yet taken full advantage of reuse mechanisms. Typically many OSS projects which share the same application domain and topic, duplicate effort and code, without fully leveraging the vast amounts of available code. This study proposes the empirical evaluation of source code folders of OSS projects in order to determine their actual internal reuse and their potential as shareable, fine-grained and externally reusable software components by future projects. This paper empirically analyzes four OSS systems, identifies which components (in the form of folders) are currently being reused internally and studies their coupling characteristics. Stable components (i.e., those which act as service providers rather than service consumers) are shown to be more likely to be reusable. As a means of supporting replication of these successful instances of OSS reuse, source folders with similar patterns are extracted from the studied systems, and identified as externally reusable components

    Structural Complexity and Decay in FLOSS Systems: An Inter-Repository Study

    Get PDF
    Past software engineering literature has firmly established that software architectures and the associated code decay over time. Architectural decay is, potentially, a major issue in Free/Libre/Open Source Software (FLOSS) projects, since developers sporadically joining FLOSS projects do not always have a clear understanding of the underlying architecture, and may break the overall conceptual structure by several small changes to the code base. This paper investigates whether the structure of a FLOSS system and its decay can also be influenced by the repository in which it is retained: specifically, two FLOSS repositories are studied to understand whether the complexity of the software structure in the sampled projects is comparable, or one repository hosts more complex systems than the other. It is also studied whether the effort to counteract this complexity is dependent on the repository, and the governance it gives to the hosted projects. The results of the paper are two-fold: on one side, it is shown that the repository hosting larger and more active projects presents more complex structures. On the other side, these larger and more complex systems benefit from more anti-regressive work to reduce this complexity

    Quality factors and coding standards - a comparison between open source forges

    Get PDF
    Enforcing adherence to standards in software development in order to produce high quality software artefacts has long been recognised as best practice in traditional software engineering. In a distributed heterogeneous development environment such those found within the Open Source paradigm, coding standards are informally shared and adhered to by communities of loosely coupled developers. Following these standards could potentially lead to higher quality software. This paper reports on the empirical analysis of two major forges where OSS projects are hosted. The first one, the KDE forge, provides a set of guidelines and coding standards in the form of a coding style that developers may conform to when producing the code source artefacts. The second studied forge, SourceForge, imposes no formal coding standards on developers. A sample of projects from these two forges has been analysed to detect whether the SourceForge sample, where no coding standards are reinforced, has a lower quality than the sample from KDE. Results from this analysis form a complex picture; visually, all the selected metrics show a clear divide between the two forges, but from the statistical standpoint, clear distinctions cannot be drawn amongst these quality related measures in the two forge samples

    Extracting and Comparing Concepts Emerging from Software Code, Documentation and Tests

    Get PDF
    Traceability in software engineering is the ability to connect different artifacts that have been built or designed at various points in time. Given the variety of tasks, tools and formats in the software lifecycle, an outstanding challenge for traceability studies is to deal with the heterogeneity of the artifacts, the links between them and the means to extract each. Using a unified approach for extracting keywords from textual information, this paper aims to compare the concepts extracted from three software artifacts: source code, documentation and tests from the same system. The objectives are to detect similarities in the concepts emerged, and to show the degree of alignment and synchronisation the artifacts possess. Using the components of three projects from the Apache Software Foundation, this paper extracts the concepts from 'base' source code, documentation, and tests (separated from the source code). The extraction is done based on the keywords present in each artifact: we then run multiple comparisons (through calculating cosine similarities on features extracted by word embeddings) in order to detect how the sets of concepts are similar or overlap. For similarities between code and tests, we discovered that using pre-trained language models (with increasing dimension and corpus size) correlates to the increase in magnitude, with higher averages and smaller ranges. FastText pre-trained embeddings scored the highest average of 97.33% with the lowest range of 21.8 across all projects. Also, our approach was able to quickly detect outliers, possibly indicating drifts in traceability within modules. For similarities involving documentation, there was a considerable drop in similarity score compared to between code and tests per module - down to below 5%.</p
    corecore